Gradient-free activation maximization for identifying effective stimuli
A fundamental question for understanding brain function is what types of
stimuli drive neurons to fire. In visual neuroscience, this question has also
been posed as characterizing the receptive field of a neuron. The search for
effective stimuli has traditionally been based on a combination of insights
from previous studies, intuition, and luck. Recently, the same question has
emerged in the study of units in convolutional neural networks (ConvNets), and
with it a family of solutions has emerged, generally referred to as "feature
visualization by activation maximization."
We sought to bring in tools and techniques developed for studying ConvNets to
the study of biological neural networks. However, one key difference that
impedes direct translation of tools is that gradients can be obtained from
ConvNets using backpropagation, but such gradients are not available from the
brain. To circumvent this problem, we developed a method for gradient-free
activation maximization by combining a generative neural network with a genetic
algorithm. We termed this method XDream (EXtending DeepDream with real-time
evolution for activation maximization), and we have shown that this method can
reliably create strong stimuli for neurons in the macaque visual cortex (Ponce
et al., 2019). In this paper, we describe extensive experiments characterizing
the XDream method by using ConvNet units as in silico models of neurons. We
show that XDream is applicable across network layers, architectures, and
training sets; examine design choices in the algorithm; and provide practical
guides for choosing hyperparameters in the algorithm. XDream is an efficient
algorithm for uncovering neuronal tuning preferences in black-box networks
using a vast and diverse stimulus space.
Comment: 16 pages, 8 figures, 3 tables
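The core loop of gradient-free activation maximization can be sketched as follows: a generator maps latent codes to stimuli, a black-box unit scores them without providing gradients, and a genetic algorithm evolves the codes. The `generator` and `neuron_response` functions here are toy stand-ins (not the XDream generator or a real neuron), chosen only so the loop runs end to end:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: in XDream these would be a pretrained image generator
# and a recorded neuron; here they are simple functions so the GA loop
# is runnable and its behavior can be inspected.
def generator(code):
    # Latent code -> "stimulus" (toy: identity mapping).
    return code

def neuron_response(image):
    # Black-box score with no accessible gradients: negative squared
    # distance to a fixed "preferred stimulus".
    target = np.linspace(-1.0, 1.0, image.size)
    return -np.sum((image - target) ** 2)

POP, DIM, GENS, SIGMA = 20, 16, 200, 0.1
population = rng.normal(size=(POP, DIM))

for _ in range(GENS):
    scores = np.array([neuron_response(generator(c)) for c in population])
    # Rank-based selection: keep the top half as parents.
    parents = population[np.argsort(scores)[-POP // 2:]]
    # Uniform crossover between random parent pairs, then Gaussian mutation.
    idx = rng.integers(0, len(parents), size=(POP, 2))
    mask = rng.random((POP, DIM)) < 0.5
    children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
    population = children + SIGMA * rng.normal(size=children.shape)

best = max(population, key=lambda c: neuron_response(generator(c)))
```

Selection, crossover, and mutation rates here are illustrative hyperparameters of the kind the paper's experiments characterize.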
Unsupervised Learning of Visual Structure using Predictive Generative Networks
The ability to predict future states of the environment is a central pillar
of intelligence. At its core, effective prediction requires an internal model
of the world and an understanding of the rules by which the world changes.
Here, we explore the internal models developed by deep neural networks trained
using a loss based on predicting future frames in synthetic video sequences,
using a CNN-LSTM-deCNN framework. We first show that this architecture can
achieve excellent performance in visual sequence prediction tasks, including
state-of-the-art performance on a standard 'bouncing balls' dataset (Sutskever
et al., 2009). Using a weighted mean-squared error and adversarial loss
(Goodfellow et al., 2014), the same architecture successfully extrapolates
out-of-the-plane rotations of computer-generated faces. Furthermore, despite
being trained end-to-end to predict only pixel-level information, our
Predictive Generative Networks learn a representation of the latent structure
of the underlying three-dimensional objects themselves. Importantly, we find
that this representation is naturally tolerant to object transformations, and
generalizes well to new tasks, such as classification of static images. Similar
models trained solely with a reconstruction loss fail to generalize as
effectively. We argue that prediction can serve as a powerful unsupervised loss
for learning rich internal representations of high-level object features.
Comment: under review as a conference paper at ICLR 201
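The training signal described above combines a weighted mean-squared error with an adversarial term. A minimal sketch of that combined loss, with illustrative function names and an assumed weighting factor `lam` (not the paper's exact formulation), might look like:

```python
import numpy as np

def weighted_mse(pred, target, weights):
    # Per-pixel weights let the reconstruction term emphasize some
    # regions (e.g. moving foreground) over others.
    return np.mean(weights * (pred - target) ** 2)

def adversarial_loss(disc_score_on_pred):
    # Non-saturating generator-side loss: push the discriminator's
    # score on the predicted frame toward 1.
    eps = 1e-12
    return -np.mean(np.log(disc_score_on_pred + eps))

def predictive_loss(pred, target, weights, disc_score, lam=0.01):
    # Total training signal: weighted reconstruction + adversarial term.
    return weighted_mse(pred, target, weights) + lam * adversarial_loss(disc_score)
```

In practice the adversarial term sharpens predictions that a pure MSE loss would blur toward the mean of plausible futures.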
Theory on the Coupled Stochastic Dynamics of Transcription and Splice-Site Recognition
Eukaryotic genes are typically split into exons that need to be spliced together to form the mature mRNA. The splicing process depends on the dynamics and interactions between transcription by the RNA polymerase II complex (RNAPII) and the spliceosomal complex consisting of multiple small nuclear ribonucleoproteins (snRNPs). Here we propose a biophysically plausible initial theory of splicing that aims to explain the effects of the stochastic dynamics of snRNPs on the splicing patterns of eukaryotic genes. We consider two different ways to model the dynamics of snRNPs: pure three-dimensional diffusion and a combination of three- and one-dimensional diffusion along the emerging pre-mRNA. Our theoretical analysis shows that there exists an optimum position of the splice sites on the growing pre-mRNA at which the time required for snRNPs to find the 5′ donor site is minimized. The minimization of the overall search time is achieved mainly via the increase in non-specific interactions between the snRNPs and the growing pre-mRNA. The theory further predicts that there exists an optimum transcript length that maximizes the probabilities for exons to interact with the snRNPs. We evaluate these theoretical predictions by considering human and mouse exon microarray data as well as RNAseq data from multiple different tissues. We observe that there is a broad optimum position of splice sites on the growing pre-mRNA and an optimum transcript length, which are roughly consistent with the theoretical predictions. The theoretical and experimental analyses suggest that there is a strong interaction between the dynamics of RNAPII and the stochastic nature of snRNP search for 5′ donor splicing sites.
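As a rough illustration of why such an optimum exists, consider a classic facilitated-diffusion estimate (an assumption standing in for the paper's full model): the searcher alternates 3D excursions with 1D sliding scans along the transcript, and the total search time is minimized when the time spent in the two modes is balanced. All constants below are illustrative:

```python
import numpy as np

# Toy Berg-von-Hippel-style estimate, not the paper's exact model:
# a searcher alternates 3D excursions of mean duration TAU_3D with 1D
# sliding scans covering ~sqrt(D1 * tau_1d) sites per landing, on a
# transcript of L sites.
L, D1, TAU_3D = 10_000.0, 100.0, 1.0

def mean_search_time(tau_1d):
    sites_per_round = np.sqrt(D1 * tau_1d)  # sliding coverage per landing
    rounds = L / sites_per_round            # rounds needed to scan the transcript
    return rounds * (tau_1d + TAU_3D)       # total time over all rounds

taus = np.linspace(0.01, 10.0, 1000)
best_tau = taus[np.argmin(mean_search_time(taus))]
# Analytically, the minimum occurs at tau_1d = TAU_3D: equal time spent
# sliding and diffusing in 3D.
```

The same trade-off structure is what produces an optimum splice-site position: more non-specific pre-mRNA interaction increases 1D coverage per round but costs time per round.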
Finding any Waldo: zero-shot invariant and efficient visual search
Visual search constitutes a ubiquitous challenge in natural vision, including daily tasks such as finding a friend in a crowd or searching for a car in a parking lot. Visual search must fulfill four key properties: selectivity (to distinguish the target from distractors in a cluttered scene), invariance (to localize the target despite changes in its rotation, scale, and illumination, and even when searching for generic object categories), speed (to efficiently localize the target without exhaustive sampling), and generalization (to search for any object, even ones with which we have had minimal or no experience). Here we propose a computational model, directly inspired by neurophysiological recordings during visual search in macaque monkeys, that maps the discriminative power of object recognition models onto the problem of visual search. The model takes two inputs, a target object and a search image, and produces a sequence of fixations. The model consists of a deep convolutional network that extracts features of the target object, stores those features, and uses them in a top-down fashion to modulate the responses to the search image, thus generating a task-dependent saliency map. We show that the model fulfills the critical properties outlined above, distinguishing it from heuristic approaches such as template matching, random search, sliding windows, bottom-up saliency maps, and object detection algorithms. Furthermore, we directly compare the model against human eye-movement behavior during three increasingly complex tasks in which subjects have to search for a target object in a multi-object array image, in natural scenes, or in the well-known Waldo search task. We show that the model provides a reasonable first-order approximation to human behavior and can efficiently find targets in an invariant manner, without any training on the target objects.
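The top-down modulation step can be sketched as follows; all names, shapes, and the greedy readout are illustrative, not the model's actual implementation. Per-location features of the search image are reweighted by the target's feature vector to form a task-dependent saliency map, and fixations are read out with inhibition of return:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative shapes: per-location features of the search image are
# reweighted by the target object's feature vector to produce a
# task-dependent saliency map.
H, W, C = 8, 8, 4
search_features = rng.random((H, W, C))  # conv features of the search image
target_features = rng.random(C)          # feature vector of the target object

saliency = np.einsum('hwc,c->hw', search_features, target_features)

# Read out a sequence of fixations greedily, with inhibition of return
# so previously fixated locations are not revisited.
fixations, s = [], saliency.copy()
for _ in range(3):
    fy, fx = np.unravel_index(np.argmax(s), s.shape)
    fixations.append((int(fy), int(fx)))
    s[fy, fx] = -np.inf
```

The key property this captures is task dependence: changing `target_features` changes the saliency map, and hence the fixation sequence, without retraining anything.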
Depression-Biased Reverse Plasticity Rule Is Required for Stable Learning at Top-down Connections
Top-down synapses are ubiquitous throughout neocortex and play a central role in cognition, yet little is known about their development and specificity. During sensory experience, lower neocortical areas are activated before higher ones, causing top-down synapses to experience a preponderance of post-synaptic activity preceding pre-synaptic activity. This timing pattern is the opposite of that experienced by bottom-up synapses, which suggests that different versions of spike-timing-dependent plasticity (STDP) rules may be required at top-down synapses. We consider a two-layer neural network model and investigate which STDP rules can lead to a distribution of top-down synaptic weights that is stable, diverse, and avoids strong loops. We introduce a temporally reversed rule (rSTDP) in which top-down synapses are potentiated if post-synaptic activity precedes pre-synaptic activity. Combining analytical work with integrate-and-fire simulations, we show that only depression-biased rSTDP (and not classical STDP) produces stable and diverse top-down weights. The conclusions did not change upon the addition of homeostatic mechanisms, multiplicative STDP rules, or weak external input to the top neurons. Our prediction of rSTDP at top-down synapses, which are distally located, is supported by recent neurophysiological evidence showing the existence of temporally reversed STDP at synapses distal to the post-synaptic cell body.
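A minimal sketch of a depression-biased rSTDP kernel, with illustrative constants rather than the paper's fitted parameters: the weight change is potentiating when the post-synaptic spike precedes the pre-synaptic spike (the reversed timing), and depressing with a larger amplitude otherwise:

```python
import numpy as np

# Illustrative constants: depression amplitude exceeds potentiation
# amplitude, giving the depression bias the analysis requires.
A_PLUS, A_MINUS, TAU = 0.8, 1.0, 20.0  # amplitudes and time constant (ms)

def rstdp_dw(t_pre, t_post):
    """Weight change for one pre/post spike pair under reversed STDP."""
    dt = t_pre - t_post
    if dt > 0:
        # Post fired before pre: potentiate (reversed relative to
        # classical STDP, where this timing would depress).
        return A_PLUS * np.exp(-dt / TAU)
    # Pre fired before post: depress, with the larger amplitude.
    return -A_MINUS * np.exp(dt / TAU)
```

Under classical STDP the signs of the two branches would be swapped; the abstract's claim is that only this reversed, depression-biased form yields stable, diverse top-down weights.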
A role for recurrent processing in object completion: neurophysiological, psychophysical and computational evidence
Recognition of objects from partial information presents a significant
challenge for theories of vision because it requires spatial integration and
extrapolation from prior knowledge. We combined neurophysiological recordings
in human cortex with psychophysical measurements and computational modeling to
investigate the mechanisms involved in object completion. We recorded
intracranial field potentials from 1,699 electrodes in 18 epilepsy patients to
measure the timing and selectivity of responses along human visual cortex to
whole and partial objects. Responses along the ventral visual stream remained
selective even when only 9-25% of the object was shown. However, these visually
selective signals emerged ~100 ms later for partial versus whole objects. The
processing delays were particularly pronounced in higher visual areas within
the ventral stream, suggesting the involvement of additional recurrent
processing. In separate psychophysics experiments, disrupting this recurrent
computation with a backward mask at ~75 ms significantly impaired recognition of
partial, but not whole, objects. Additionally, computational modeling shows
that the performance of a purely bottom-up architecture is impaired by heavy
occlusion and that this effect can be partially rescued via the incorporation
of top-down connections. These results provide spatiotemporal constraints on
theories of object recognition that involve recurrent processing to recognize
objects from partial information.
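As a toy analogue of recurrent completion (a Hopfield-style attractor network, not the paper's model), recurrent dynamics can restore a stored pattern from a cue in which most of the entries are occluded, which a single feedforward readout of the cue alone cannot do:

```python
import numpy as np

rng = np.random.default_rng(2)

# Store a few random binary "object" patterns in a symmetric weight
# matrix via the Hebbian outer-product rule.
N = 100
patterns = np.sign(rng.standard_normal((3, N)))
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

# Occlude 75% of the first pattern to form the cue.
cue = patterns[0].copy()
cue[25:] = 0.0

# Recurrent iterations progressively fill in the occluded entries.
state = cue.copy()
for _ in range(10):
    state = np.sign(W @ state + 1e-9)  # small offset breaks sign(0) ties

overlap = float(state @ patterns[0]) / N  # 1.0 means perfect completion
```

The point of the analogy is architectural: the completion comes from iterating the recurrent update, mirroring the extra processing time the recordings show for partial objects.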
- …